Toward high-performance key-value stores through GPU encoding and locality-aware encoding

Authors

  • Dongfang Zhao
  • Ke Wang
  • Kan Qiao
  • Tonglin Li
  • Iman Sadooghi
  • Ioan Raicu
Abstract

Although distributed key-value stores are becoming increasingly popular as a complement to conventional distributed file systems, they are often criticized for the costly full-size replication they use for high availability, which causes high I/O overhead. This paper presents two techniques to mitigate that I/O overhead and improve key-value store performance: GPU encoding and locality-aware encoding. Instead of migrating full-size replicas over the network, we split the original file into smaller chunks and encode them with a few additional parity codes using GPUs before dispersing them onto remote nodes. The parity code is usually much smaller than the original file, which saves the extra space required for high availability and reduces the I/O overhead. Meanwhile, the compute-intensive encoding process is greatly accelerated by the massive number of GPU cores. However, splitting the original file into smaller chunks stored on multiple nodes breaks data locality from the application's perspective. To address this, we present a locality-aware encoding mechanism that allows a job to be dispatched as finer-grained tasks directly on the node where the required chunk resides. The data locality is therefore preserved at the finer granularity of the sub-job (i.e., task) level. We conduct an in-depth analysis of the proposed approach and implement a system prototype named Gest. Gest has been deployed and evaluated on a variety of testbeds, demonstrating that high data availability, high space efficiency, and high I/O performance can be achieved at the same time.
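The chunk-and-parity idea described in the abstract can be sketched with the simplest possible erasure code: one XOR parity chunk over k data chunks (RAID-5 style), which tolerates the loss of any single chunk at a space cost of 1/k instead of the 1x overhead of a full replica. This is only an illustrative stand-in for Gest's actual GPU-accelerated codec; the function names here are hypothetical, not Gest's API.

```python
# Illustrative sketch of splitting a file into k chunks plus one XOR
# parity chunk, then recovering a lost chunk. Gest's real codec and its
# GPU kernels are not shown; these names are made up for the example.

def split_into_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-size chunks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]

def xor_parity(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of all chunks; serves as the parity chunk."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

def recover(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single lost data chunk from k-1 survivors + parity."""
    return xor_parity(surviving + [parity])

data = b"hello key-value store!"
chunks = split_into_chunks(data, 4)       # k = 4 data chunks
parity = xor_parity(chunks)               # 1 parity chunk (25% overhead)

# Pretend chunk 2 is lost; XOR of the survivors and parity rebuilds it.
rebuilt = recover(chunks[:2] + chunks[3:], parity)
assert rebuilt == chunks[2]
```

With k data chunks and one parity chunk, the extra space is only 1/k of the original size, compared with a full extra copy under replication, which mirrors the space-efficiency argument in the abstract. Production erasure codes (e.g., Reed-Solomon) generalize this to multiple parity chunks.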


Related articles

Achieving Data-Aware Load Balancing through Distributed Queues and Key/Value Stores

Load balancing techniques (e.g., work stealing) are important for obtaining the best performance from distributed task scheduling systems. In work stealing, tasks are randomly migrated from heavily loaded schedulers to idle ones. However, for data-intensive applications where tasks are dependent and task execution involves processing large amounts of data, migrating tasks blindly would compromise the dat...


Locality-aware partitioning schemes for large-scale data stores

Key-value stores are widely recognized as scalable systems with good performance, and are the backbone of several large-scale storage deployments. However, their interface is rather restrictive, since it only allows access to objects through their keys. To address this problem, recently proposed systems have developed mechanisms for storing data by mapping it into multiple dimensions, in order t...


Collaborative inter-prediction on CPU+GPU systems

In this paper we propose an efficient method for collaborative H.264/AVC inter-prediction in heterogeneous CPU+GPU systems. In order to minimize the overall encoding time, the proposed method provides stable and balanced load distribution of the most computationally demanding video encoding modules, by relying on accurate and dynamically built functional performance models. In an extensive RD a...


Design and evaluation of a hybrid encoding method for timing covert channels in the Internet

A covert channel communicates information by concealing it within an overt, authorized channel in such a way that the channel's very existence remains hidden. In network covert timing channels, which modulate covert information onto the timing features of transmitted packets, the choice of encoding scheme is very important. In this paper, a hybrid encoding scheme is proposed by combining "the inter-pac...


Address-free memory access based on program syntax correlation of loads and stores

An increasing cache latency in next-generation processors incurs profound performance impacts in spite of advanced out-of-order execution techniques. One way to circumvent this cache latency problem is to predict load values at the onset of pipeline execution by exploiting either the load value locality or the address correlation of stores and loads. In this paper, we describe a new load value ...



Journal:
  • J. Parallel Distrib. Comput.

Volume 96, Issue -

Pages -

Publication year: 2016